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ABSTRACT 


'  The  structure  of  second-order  processes  is  exposed  by 
specification  of  whitening  filters  and  modeling  filters,  or 
equivalently  by  Cholesky  decompositions  of  the  covariance  matrix 
and  its  inverse.  We  shall  show  that  these  filters  can  be  obtained  as 
a  cascade  of  lattice  sections,  each  specified  by  a  single  so-called 
reflection  coefficient  parameter.  For  stationary  processes,  the 
reflection  coefficient  will  be  time-invariant.  For  nonstationary 
processes  we  can  use  the  displacement  rank  concept  either  to  find 
a  simple  time-update  formula  for  the  reflection  coefficients  or  to 
replace  them  by  a  time-invariant  vector  reflection  coefficient  of 
size  governed  by  the  displacement  rank  of  processes. 


These  results  are  obtained  in  a  quite  direct  way  by  using  a 
geometric  (Hilbert-space)  formulation  of  the  problem,  combined 
with  old  results  of  Yule  (1907)  on  update  formulas  for  partial 
correlation  coefficients  and  of  Schur  (1917)  and  Szego  (1939)  on 
the  classical  moment  problem... 
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1.  INTRODUCTION 


In  recent  years  there  has  been  considerable  interest  in  lattice  filters  for 
signal  processing.  Though  such  structures  had  been  well  studied  in  network 
theory  as,  for  example,  in  the  wave  digital  filters  of  Fettweis  (see,  e.g..  Fettweis 
et  al.  (1974))  and  especially  in  the  cascade  synthesis  of  multiport  networks 
(Dewilde  (1969)),  the  first  applications  in  on-line  adaptive  signal  processing  were 
apparently  made  by  Itakura  and  Saito  (1971)  in  the  field  of  speech  analysis.  Lat¬ 
tice  filter  models  were  also  familiar  in  geophysical  signal  processing  as  'layered 
earth  models'  (see,  e.g.,  Robinson  (1967));  Burg  ((1970),  unpublished)  explicitly 
suggested  lattice  filter  models  for  the  implementation  of  a  spectral  estimation 
technique  based  on  the  maximum  _ntropy  principle  (Burg  (1967),  see  also  Alam 
(1978)).  The  lattice  filters  used  in  these  works  were  based  on  the  assumption  of 
an  underlying  stationary  process,  and  though  in  fact  the  filters  were  applied  to 
deterministic  and  nonstationary  processes,  it  was  believed  that  then  the  lattice 
solutions  were  suboptimal  (see  e.g..  Makhoul  (1977)).  Morf  and  Vieira  (see  Morf, 
Vieira  and  Lee  (1977),  Vieira  (1977))  were  the  first  to  show  that  the  optimal  solu¬ 
tions  to  certain  (so-called  prewindowed)  nonstationary  model  fitting  problems 
could  be  obtained  in  lattice  form,  though  with  time-variant  lattice  parameters 
(reflection  coefficients).  These  lattice  filters  demonstrated  excellent  tracking 
capability  and  rapid  convergence  (see,  e.g.,  Morf  and  Lee  (1978)  (1979),  Satorius 
and  Pack  (19B1).  Hodgkiss  and  Presley  (1981)).  A  recent  thesis  by  Lee  (1980) 
and  a  paper  by  Lee,  Morf  and  Friedlander  (1981)  present  a  number  of  efficient 
normalized  versions  of  the  prewindowed  lattice  (which  they  call  ladder)  algo¬ 
rithms.  A  forthcoming  paper  by  Porat,  Friedlander  and  Morf  (1982)  presents  a 
similar  comprehensive  analysis  of  the  so-called  "covariance"  lattice  algorithms. 

In  Section  3  of  this  paper  we  shall  show  that  these  results  on  the  pre¬ 
windowed  and  covariance  algorithms  can  be  embedded  into  a  more  general 
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theory  of  time-variant  lattice  filters  for  nonstationary  stochastic  processes.  We 
shall  start  in  Section  2  by  showing  that  lattice  filters  can  be  set  up  for  determin¬ 
ing  the  innovations  of  any  (second-order)  process;  however,  the  computational 
effort  is  0(NS),  where  N  is  the  number  of  observed  random  variables,  and  this 
is  the  same  as  for  any  other  method  (e  g.,  direct  Gram-Schmidt  orthogonaliza- 
tion)  of  determining  the  innovations.  Of  course,  the  lattice  structure  has  vari¬ 
ous  advantages:  modularity,  local  interconnections,  simple  checks  for  stability, 
reduced  sensitivity  to  parameter  and  roundoff  errors.  We  may  note  that  some  of 
these  nice  properties  arise  from  the  fact  that  the  lattice  algorithms  are  a  form 
of  Modified  Gram-Schmidt  algorithm,  which  is  known  to  have  better  numerical 
properties  than  the  direct  Gram-Schmidt  procedure  (see,  e  g.,  Stewart  (1973)). 
However,  our  major  point  is  that  order  of  magnitude  reductions  in  the  computa¬ 
tional  burden  can  be  obtained  by  introducing  the  concept  of  displacement  rank 
as  a  measure  of  nonstationarity;  in  particular,  we  shall  use  the  displacement 
rank  to  obtain  a  general  time-update  formula  for  the  reflection  coefficients.  The 
derivation  is  quite  direct  and  elegant,  calling  upon  an  old  identity,  due  to  Yule 
(1907),  on  partial  correlation  coefficients  and  a  geometric  characterization,  due 
to  Delosme  and  Morf  (1980),  of  displacement  rank  via  so-called  "pinning  vec¬ 
tors",  which  were  introduced  by  Sidhu  and  Kailath  (1975)  in  connection  with  the 
Chandrasekhar  equations  (Kailath  (1973),  Kailath,  Sidhu.Morf  (1973)).  While 
Inputs  from  a  number  of  people  (especially  D.  T.  L.  Lee,  M.  Morf,  D.  R.  Morgan.  J- 
M.  Delosme)  contributed  to  the  development,  our  presentation  follows  one  put 
together  by  H.  Lev-Ari,  and  to  appear  in  his  thesis  (Lev-Ari,  (1982)). 

In  the  final  section  of  this  paper,  we  shall  show  how  the  displacement  rank 
can  be  exploited  to  obtain  a  different  kind  of  simplification-a  time-invariant  lat¬ 
tice  filter  but  with  somewhat  more  complicated  sections,  having  a-1  delays 
per  section  rather  than  just  one,  where  a  is  the  displacement  rank.  This  result 
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was  first  obtained  as  a  consequence  of  the  generalized  Levinson  algorithm 
derived  by  Friedlander  et  al.  (197B),  (1979).  The  original  derivation  was  quite 
lengthy  and  relied  heavily  on  insights  from  the  state-space  Chandrasekhar  equa¬ 
tions;  in  the  meantime,  through  contributions  from  Dewilde,  Vieira,  Porat.  Genin, 
Delosme,  Dym  and  others,  and  especially  Lev-Ari,  a  much  clearer  view  has 
emerged.  Certain  classical  results  of  Schur  on  tests  for  the  positivity  of  moment 
matrices,  when  combined  with  the  displacement  rank  concept,  turn  out  to  pro¬ 
vide  a  simple,  yet  general  and  insightful,  view  of  the  topic  We  shall  present  an 
outline  of  this  approach  in  Section  4,  while  a  detailed  exposition  and  further 
results  will  appear  in  the  thesis  of  H.  Lev-Ari  (1982). 


2.  LATTICE  RECURSIONS  TOR  GENERAL  NON  STATION  ARY  PROCESSES 

We  shall  be  studying  an  indexed  (by  time)  collection  of  "vectors" 
(yit  i=0,l . (,...}  in  some  Hilbert  space,  with  given  inner  products 

<yx.yj>  =  tfij  •  (i) 

and  such  that  the  Gramian  matrix 

R  =  to.,]  (2) 

is  a  symmetric  positive  definite  matrix.  We  generally  have  in  mind  that  the  J 
are  random  variables  in  a  probability  space  where  the  inner  product  is  defined 
by  expectation. 

<Vi.Vi>  =  tf&iy/ }  ■ 

where  the  asterisk  denotes  complex  conjugation  (or  Hermitian  transpose, when 
applied  to  matrix  quantities).  However,  this  stochastic  interpretation  is  not  at 
all  necessary. 

We  shall  denote 

I  1  Vi  1 1 8  :=  <Vi.yx> 

and  shall  use  bars  to  denote  normalized  quantities,  e  g., 

Vi  =  I  IVi  I  )_1Vi  •  (3) 


The  structure  of  the  collection  of  vectors  will  be  explored  by  studying 
the  family  of  finite-order  residuals 

em.t  =yt  ~yt\'j  .  (4) 

where 

V  =  \Vt-m yi~  i?  (5) 

and 
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y  1 1 7  =  the  projection  of  yt  on  the 
space  spanned  by  the  set  U  . 


(6) 


We  shall  call 


em.t  -  the  m-th  order  forward  residual  at  time  t  .  (7) 

We  note  that  the  residuals 


Bfj  .  t  —  0,1,2,... 

will  be  the  so-called  innovations  process  associated  with  the  \yt  ,t  =0,1,... J  which 
is  usually  the  chief  object  of  attention  in  least-squares  estimation  theory;  to 
obtain  lattice  filters,  however,  it  is  convenient  to  imbed  these  desired  quantities 
in  the  larger  family  of  residuals  \em  t,  m<t ,  t  =0,1, ...j. 

The  structure  of  this  family  can  be  exposed  by  first  seeking,  for  each  fixed 
t ,  how  to  determine  order-updates  of  these  residuals,  i.e.,  knowing  em  t  we  shall 
try  to  determine 

em+u  =Vt  -V  t|l  v.vt-^-,1 


in  some  convenient  way.  It  is  reasonable  to  seek  to  use  our  knowledge  of  em  t 
by  making  the  orthogonal  decomposition 

W.y'-m-ll  =  v  © 


where 


'=  Vt-m-1  y  t  -m-l|  U  ■ 

the  m-th  order  backward  residual  at  t  -  m  —  1  . 


(B) 


Then  we  can  write 

y  t  UC/.*,-*,-,!  -Vuv+Vt  |rTO 

=  V  t\V  +  •  (9) 

It  follows  that  we  can  write 
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~  em.t  ~ 

=  em.t  ~  ■rm,t  -1 

=  I  I  em.t  I  I  iem.t  ^em.t 
~  I  I  em.t  I  I  (em.<  ~ 

where  we  used  the  fact  that  rm  t_,  4-  U  to  obtain  the  second  equality,  and  where 
we  defined 

^ m+l.i  <em.t  •  (10) 

For  reasons  given  later,  such  quantities  will  be  called  "reflection  coefficients". 

To  also  normalize  em4.l  t,  we  need  to  compute  its  norm,  for  which  we  note 
that  the  last  equality  yields 

I  +  !  |2  =  I  I  em.l  I  l(^  —  I  em.t  I  I  * 

so  that 

I  I +  I  I  =  I  I em.i  I  l(/  —  fcra  +  l.«fcm  +  l,<)1/2  (11) 

Then  we  can  rewrite  the  order-update  formula  for  the  forward  residuals  in  nor¬ 
malized  form  as 

=  (^  1/Z(®m,t  "  (12) 

This,  of  course,  leaves  us  with  the  problem  of  getting  But  a  similar 

recursion  can  be  set  up  for  it.  Thus  and  more  briefly,  we  can  write 

rm+l.<  =  Vt-m-1  —  7,v(j 

—  1  <V< -ra  —  1  >^m.t 

I  —  ll  I  (^m.J  -l  ^®rn.J ) 

=  I  lrm.t  -1 1  I  (rm.J  -l  ~  km  +  l.t®m,l) 

This  yields 

I  lrm  +  l.t  I  I  =  I  1  I  (/  —  + 

so  that 
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~  {I  +  (13) 

The  recursions  (12)  and  (13)  can  be  pictorially  represented  as  in  Figure  1(a)  as 
a  typical  "lattice”  section.  Moreover,  we  see  that  we  can  put  the  sections 
together  as  in  Figure  1(b),  to  get  a  cascaded  structure,  with  each  lattice  section 
specified  by  a  reflection  coefficient, 

fcjn  +  l.t  ~  t^m ,t  —  ^  —  0,1,...  . 


(d-k‘) 


(d- k') 


^•»*i  t 


figure  l(a).  lattice  Section 


Figure  1(b).  Lattice  Filter  with  Time-Varying  Sections 


Note  that  the  inputs  to  the  first  section  are 

e0 .1  =  Vt  =  1 1 Vt  1 1 -13/<  =  r0.t  (14) 

and  that  the  first  section  has  reflection  coefficient 

k  i.t  =  <eo,t  •ro.t-i>  —  =  Rt.t/zRt.t- (15) 

The  name  reflection  coefficient  is  partly  justified  by  the  fact  that,  by 
Schwarz’s  inequality, 

ll*m«.tll  I  |em.t !  I  I  |rm.»-il  T  =  1  .  (16) 

The  Stationary  Case 

A  more  detailed  justification  is  obtained  from  closer  study  of  the  "station¬ 
ary''  case,  where  stationarity  means  that  the  inner  products  are  invariant  under 
shift  in  the  indices,  i.e., 

<yi-Vi>  =  .  *  =  1.2....  (17) 

In  this  case,  it  follows  that  the  reflection  coefficients  }  are  independent  of 

t ,  so  that  we  have  a  cascade  lattice  filter  with  constant  or  time-invariant  sec¬ 
tions.  Physical  realizations  of  such  time-invariant  filters  (and  in  particular  of 
their  inverses)  have  been  studied  in  some  detail  in  geophysics  (see,  e  g.,  Robin¬ 
son  (1967),  Claerbout  (1976)),  and  in  speech  analysis  (see,  e.g.,  Wakita  (1973), 
Market  and  Gray  (1976)),  and  provide  the  real  justification  for  the  name 
reflection  coefficients.  We  shall  not  pursue  this  further  here,  but  may  refer  to 
Paper  No.  by  Benveniste  earlier  in  this  volume. 

Simplifications  in  the  Nonstationary  Case 

In  the  nonstationary  case,  when  (17)  does  not  hold,  it  seems  unavoidable 
that  the  reflection  coefficients  must  be  time-variant.  However,  it  is  reasonable 
that  the  complexity  of  the  time-variation  should  depend  upon  the  degree  of  non- 
stationarity,  measured  in  some  sense  It  turns  out  that  the  concept  of 
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displacement  rank  provides  a  meaningful  way  of  classifying  nonstationary 
processes,  in  that  for  a  process  with  displacement  rank  a,  each  reflection 
coefficient  can  be  updated  with  0(a)  multiplications.  For  N  observations  and 
N  reflection  coefficients,  this  requires  0(Nza)  multiplications  as  compared  to 
the  0(N9)  that  would  be  required  if  we  just  used  the  general  formulas  given 
above  without  attention  to  the  displacement  rank. 

The  time-update  formulas  can  be  compactly  stated: 

kfn+l.t  —  (l  ~  77TO.t’?m.j)1/2^:m-i-l,<-l(l  ~  FmJ -l)1/Z  +  Vm.tlYn.t  -1  (lB) 

where  {77, fx\  are  a-dimensional  row  vectors  obeying  the  recursions  (Yule“s  Par- 
cor  identity) 

Vm+l.t  —  ’l-hn.t -1$  (19) 

Pm-M.l  —  Flfan.t-l-km  +  l.t-Vm.t  I  (20) 

where  the  function  F(  )  is  defined  as 

F\A.B,C\  =  (1  -  BB')~l/s{A  -  BC*){\  -  CC*)"V2  (21) 

In  fact,  we  may  note  that  the  first  equation  is  a  rearrangement  of  the  formula. 

=  Ffem  +  l.t  'Vm.t  ’Hm.t -1$  (22) 

=  (/  “  »W*)'1/2(fcmtU  ~V M*)(/  ~  W*Y'/Z 

where  the  subscripts  for  and  rj  have  been  omitted  for  simplicity. 

The  displacement  rank  itself  is  given  as 

a  =  rank  of  {R  -  ZRZ']  (23) 

where  R  is  the  covariance  or  Gramian  matrix  (2)  and  Z  is  the  lower  shift 
operator  with  ones  on  the  first  subdiagonal  and  zeros  elsewhere. 

The  three  recursions  (18)-(20)  can  be  pictorially  represented  as  in  Figure  2. 
The  block  named  D  denotes  a  "delay  operator"  and  the  numbers  indicated 
denote  implementation  of  the  respective  equations. 
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Figure  2.  Representation  of  the  Time-Update  Calculations 
Time-Invariant  Implementations 

Perhaps  surprisingly,  the  displacement  rank  can  be  used  to  reduce  the 
complexity  in  a  different  way-by  allowing  completely  time-invariant  gains  but  of 
a  higher  dimension.  That  is.  we  shall  still  have  a  cascade  of  lattice  sections,  but 
each  section  will  be  defined  by  an  (a-l)-dimensional  row  vector  rather  than  a 
scalar-see  Figure  3.  These  row  vectors  will  be  called  generalized  Schur 
coefficients. 
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Figure  3.  Generalized  Time-Invariant  lattice  Section 

As  shown,  the  input  to  the  'Delay'  is  a  (a-l)xl  signal;  the  'Delay'  itself  is 
where  /a_ i  denotes  an  identity  matrix  of  dimension  (a-l)x(a-l).  It 

Z  —  (Xi  * 

can  be  replaced  by  - —  x/a_, ,  to  yield  an  ARMA  lattice  structure.  The 

1— OjZ 

generalized  Schur  coefficients  J/Q)  are  row  vectors  of  dimension  a-1,  satisfy- 


1  -  Ki  JK*>  0  (24) 

where  a  is  the  displacement  rank  of  the  covariance  matrix  R  of  and  J  is 
defined  through  the  signature  of  R  as: 


sgn|R  -  Z  RZ'j  = 
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Thus  J  is  an  (a-l)x(a-l)  diagonal  matrix  with  entries  +1  or  -1.  Relation 
(24)  can  be  compactly  written  by  using  the  ./-norm  notation: 


\\Ki\\j*  1 

To  conclude  this  introduction,  we  should  also  remark  that  we  shall  show 
that  the  generalized  Schur  coefficients  \Ki\  are  not  unique:  different  set  of 
\Kxl  can  be  associated  with  a  given  covariance  function,  even  in  the  stationary 
case! 

Also,  a  given  set  of  can  correspond  to  several  different  covariance 

functions,  all  ‘'congruent"  to  each  other  (see  Section  4). 
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3.  DERIVATION  OF  THE  TIME-UPDATE  FORMULAS 


The  time  update  formulas  (l8)-(20)  will  follow  very  easily  by  combining  cer¬ 
tain  results  on  partial  correlation  coefficients  and  on  displacement  ranks.  These 
will  be  presented  in  the  first  two  subsections.  The  third  subsection  will  contain 
the  proofs  of  (18)-(29)  and  the  final  subsection  will  give  an  application  to  the  so- 
called  exact  deterministic  least-squares  algorithm  for  "pre-windowed"  data 
sequences. 


3. 1  Parcor  Coefficients  and  Yule*  s  Identity 

The  reflection  coefficients  as  defined  by  (10)  can  be  identified  as  partial 
correlation  (Parcor)  coefficients.  Before  proceeding  with  their  analysis,  it  will 
be  helpful  to  introduce  some  notation  and  derive  some  general  properties  of 
such  coefficients. 

Let  a. 6  be  vectors  in  some  Hilbert  space.  Vfe  shall  denote  the  correlation 
coefficient  of  a  and  6  as 

p(a,b)  =  <a,b>  =  J  ja  j  |~l  •  <a,b>  •  |  jb  1 1-1  (26) 

where  a=||a||-,a  is  the  normalized  version  of  a.  Given  any  set  V  of  vectors, 
let  cT|  v,  or  more  simply  ay,  denote  the  residual  error  in  estimating  a  from  V. 
Thus 


ay  :=  a  -  <a, V>F  (27) 

where  V  is  any  orthonormal  basis  for  the  linear  space  spanned  by  the  set  V. 
Then,  we  shall  define  the  partial  correlation  coefficient  of  |a,6J  given  V,  a 
concept  first  introduced  by  Yule  (1907),  as 

py(a,6)  =p(ay,bv) 

=  |  |ay|  |_l<ay,6y>|  |6y|  |_#  (28) 

In  this  notation,  the  reflection  coefficient  fcm+ u  defined  in  (10)  can  be  written 
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as, 


+  U  -  P 'j[yt  •Vt-m-l) 


where  we  recall  that  (cf.  (5)) 


U  =  \yt-m . V«-i} 


(29) 


We  can  write 


ay  =  a  -  <a,V>V 

=  |  |a  |  |  •  (a  -  <a,V>V) 
=  |  |a|  |  •  (a  -p( a.\>)V) 


and  check  that 


II“kI  I  =  I  |a|  I  '  (/  -p(a.V)p#(a,V 0),/2 


so  that 


ay  =  I  I  |  I  1  a-v 

=  (/  - p{a,V)p'{a,\'))~x/z{a  ~p{a,V)V)  . 

With  a  similar  expression  for  by,  we  can  then  deduce  that 

pv{a,b )  =  p{ay.by)  =  <5y,by> 

=  (/  -p{a,V)p\a,V))-l'z<{d  -p[a,V)V),(b  -p(b.V)  V)> 

■  {I  -p{b,V)p‘(b,V))-''z 

=  (/  —  p(a,  V)p*(a,  V))_1/2(p(a,6)  -  p{a  ,V)p*(b  ,V)){I  -p(b,  V)p*(b,V)) 

(30) 


or  more  compactly 

px(a.b)  =  F\A.B,C\ 

S-  (/  -  Buy  w2{A  -  BC*)(I  -  CC')-,/z 


where 


A  =p(a,b);  B  =  p(a.V);  C  =  p(b  .V)  . 


(31) 


Now  we  shall  seek  a  formula  for  modifying  a  partial  correlation  coefficient 
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*/2 


~  o 


py(a.b)  when  the  set  U  is  changed  to  \U,V\ 

For  this,  note  first  that  the  space  spanned  by  \U,V {  can  be  written  as  a 
direct  sum  =j£/®  Vy\,  where  V<j  is  orthogonal  to  U.  Now  ^  can  be  com¬ 
puted  as  follows. 

First  estimate  a  from  U,  with  error  av\  then  estimate  a,j  from  V,j. 
with  final  overall  error  of  Therefore  we  have  a  nice  formula,  which  the 

reader  should  thoroughly  assimilate: 

a|  'J.v\  =(a'j)v.j 

Then  we  can  write. 

Pf!/.v|(°.&)  =  p{a\<j.v[.b\v.v\) 

-  P((a'j)vrib  v)v,j) 

-  PVj(*<J. b'j) 

which  by  using  (31).  we  can  write  as 

=  PV,jW'J-b’j) 

-  F\p(a,J>b  'j)'P(a'j'V'j),p(b  tj,V<j)\ 

=  F\p<j{a.b).p,j{a,V),p,j{b  ,V)\  (32) 

This  basic  result  first  appeared  (in  different  notation)  in  Yule’s  original  paper 
(1907)  and  therefore  we  shall  call  it  Yule's  Parcor  Identity. 

3.2  Displacement  Ranks,  Differential  Generators,  and 
Pinning  Aggregates 

We  turn  now  to  the  other  key  ingredient  of  the  time-update  formula.  Let  us 
denote 

Re*  =  (^]<*s0  where Ru  =  <yi.yj>  (33) 

If  we  define  a  block  column  vector 
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yc  n  -  [y^y\ . vnY 


(34) 


then  we  can  write 


'Ron  -  <Y  c  .v  y  o  n  >  • 

Now  we  introduce  the  displacement  operator 

— I  Ro..v  =  Ro.v  —  2  R zjf Z 


(35) 


(36a) 


where 


0 

1 


z  = 


0 

i  o, 


The  displacement  rank  is  defined  as 


(36b) 


a  =  rank(_|R)  (37) 

Since  RQJV  is  Hermitian,  so  will  be  _|RC:jv-  Therefore,  it  has  real  eigenvalues, 
say  g  +  strictly  positive,  and  g_  strictly  negative,  so  that 


a  =  +  ?- 

Then  we  can  write  _|RC^,  although  nonuniquely,  as 


where 


and 


R 


0  :N 


—  Z  R(j:^  Z  =  Gc  ,v  E  Go- 


c.v 


J0:N 


Gq.N  = 


■  9  o 
9 1 


■  9n 


(38) 


(39) 


(40) 


(41) 
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( ConX )  will  be  called  a  ‘Differential  Generator'  of  Rcjv.  since  it  is  not  hard  to 
see  that  Ron  can  be  uniquely  recovered  from  knowledge  of  \G0.n£\ 


An  interesting  geometric  interpretation  can  be  associated  with  (39).  rewrit¬ 
ten  as 

Ron  ~  Co  n  £  Cc\v  -  Z  R  Z*  (42) 

Delosme  and  Morf  (1980)  showed  that  we  could  find  a  collection,  X,  of  vectors. 
a  in  number,  in  an  extended  Hilbert  space,  such  that 

<X,X>  -  E  and  Gay  =  <Y0n,X>  (43) 

We  may  remark  that  such  so  called  "pinning  aggregates"  were  first  introduced 
by  Sidhu  and  Kailath  (1975)  to  obtain  a  geometric  derivation  of  the  state-space 
Chandrasekhar  equations. 

Now.  in  terms  of  X,  we  can  express  (42)  in  the  suggestive  form, 

<YOJv.Yo:W>  “  ^oN'XXX.Xy-'KYoy'X^  =  <Z  Y0:N.Z  Yc,v>  (44) 
For.  noticing  that  the  residual 

(Ye  w)*  =  Y0jv  ~  <YC»,X><X,X>-'X 
equation  (44)  shows  that 

<(Yo:jv)^.(Yo:^)Ar>  =  <Z  Yq y.Z  Y0;At>  (45) 

This  equality  allows  us  to  set  up  an  isometry  (i.e.,  a  norm-preserving  iso¬ 
morphism)  between  the  spaces 

{space  of  vectors  (Yojv)*I  ~  {space  of  Z  Y0jvj  (46) 

In  particular,  we  have 

(Vo)x  ~  0  and  ( yt)x  ~yt-i  (47) 

Now  returning  to  our  problem,  we  recall  that 


w 


where 


=  p[(vt)'jXyt-m-i)rj] 
=  p 'Ay 


u  =  \yt~m . yt-x\ 

It  will  be  convenient  to  introduce  an  operator  D  such  that 

Dyt  =  yt  -i .  t  >  1  ■  (48a) 

DU  =  \yt-m-i Vt-  ij  (48b) 

With  this  notation,  what  we  are  seeking  in  the  time  update  formula  is  a  relation 
between 


=  PAyt.y-m-i)  and  =  poADyt.Dyt^t)  (49) 

3.3  Derivation  of  Time-Update  Formulas 

With  the  help  of  the  isometry  and  Yule’s  Parcor  identity,  we  can  derive  the 
time-update  formulas  as  follows. 

Using  first  the  isometry  (48)  and  then  Yule’s  identity  (32).  we  can  write 

-  PoADyt-Dyt-m-i) 

=  p>jx{{yi)xAyt-m-\)x) 

~  P[  !/.AT|(V<  >Vt  -m  -l) 

=  F\pAyt  yt  -m-i  ).pAvt  .x),p  Avt-m-i.x)\ 

~  ^'Vcm*\.t>T}m,t’Pm..t-\\  (49) 

where  we  have  defined 

Vm.t  ~P'Ayt<x)  and  pAVt-m-vx)  (50) 

We  can  readily  obtain  recursions  for  these  quantities  as  well.  Thus 

Vm*l.t  =  PiVt-m-V'Jl(Vt'X) 

=  F\pAvt.z)'pAvtyt-m-i)-pAxyt-m-i)\ 

=  (5l) 
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and  finally, 


Protl.f  ~  P\  rJ.yt  j(l/<-m-l*®) 

=  F\P  j{y t-m- 1.*  )  -P  'J  iVt  -m-l.Vt).P'Ax  ,yt)\ 

—  F\Pm.t ‘Vm.t  i  (52) 

Equations  (49),  rewritten  to  express  km¥lt  in  terms  of  Arm<.lt_,f  combined 
with  equations  (5l)-(52)  are  just  the  time-update  formulas  presented  earlier  in 
Section  II.  Note  how  immediately  they  follow  from  the  isometry  and  Yule's  iden¬ 
tity. 


3.4  Application  to  the  Time-Series  Pre-Yfindowed  Case 

In  this  section,  we  shall  apply  the  theory  developed  so  far  to  a  deterministic 
least-squares  problem,  studied  by  Morf,  Vieira  and  Lee  (1977)  and  Lee,  Morf  and 
Friedlander  (1981). 


(i)  Statement  of  the  Problem 
Given  a  series  of  samples 

l/(0).y(l) . y(t) . y(/<0 

in  the  so-called  pre-windowed  method,  we  try  to  fit  an  nth  order  autoregression 
to  it  by  choosing  coefficients  jl.Oj . a? to  minimize  the  following  quantity. 

£  •»“(<)  (53) 

t=o 


where 


with 


en(0  =  V(0  +  «iV (*  -  1)  +  "  '  +  *ny(t  ~  n)  (54a) 


y(-fc)  =  0.  k*  1  . 


(54b) 


The  assumption  (54b)  means  that  we  are  assuming  that  samples  prior  to 
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t=  0  are  zero,  hence  the  name  "pre-windowed".  The  residuals  en(l)can.be\vrit- 
ten  in  a  matrix  form  as  below: 


[“n 


0 

,  0  y  (O' 
ly(0)  y(i 


y(o)  1/(1)  ■  y{N-n) 


.  y{N) 


=  [*n(0) 


(55a) 


or  more  compactly 


A„  •  Y(n)  =  e^v  (55b) 

The  solution  to  our  r  .iramization  problem  can  then  be  seen  to  be  the  solution  of 
the  normal  equations. 


A„Rn  =  [0  •••/$] 


(56) 


where 

Rn  =  Y(n)y*(n)  and  K  =  min  £  e*(t)  =  |  | .w  I  Is  • 

»=o 

Note  that  Rn  will  not  in  general  be  a  Toeplitz  matrix  (unless,  in  fact, 
y(N  -  n  +  1)  =  0  =  •••  =  y{N)). 

(ii)  Solution  of  the  Problem 

We  can  cast  this  deterministic  least-squares  problem  into  our  previous  Hu¬ 
bert  space  framework  as  follows: 

Define  our  Hilbert  space  elements  as  row  vectors 

yt  =  [0  •••  0  y(0)  •••  y  (f )]  torO^t^N  (57) 

of  some  fixed  length  T,  T  >  N.  And  take  as  inner  products, 

<y*.y»>  =  Vi  •  Vt 


9  [o  •  •  •  o  y(o)  •  •  •  y(03[o  •  •  •  o  v(o)  •  •  •  y(s)]' 
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i=0 


(58) 


where  tAs  stands  for  the  smaller  of  t  and  s.  Then,  we  see  from  (55a)  that 

choosing  (1,  aj . a^l  to  minimize  ^  e^(f)  is  equivalent  to  finding  the  best 

(=0 

prediction  of  the  Hilbert  space  vector  y#  given  n  earlier  vectors 
h/jv-l.  ••  •.  yN -n  } • 

We  can  now  use  our  earlier  general  result  as  follows:  First,  we  must  find  the 
displacement  rank  a  and  the  pinning  vector  X.  By  our  definition  of  inner  pro¬ 
duct,  we  have 


R 


o:N  ~ 


Vo 

Vl 

'yn 


{Vl .  Vv] 


Thus. 


I Ro :N  =  R0:tf  “  ZR0:NZ* 


=  [VtVs*  -  Vt  -iV**— i  ]<.s=0:Ar  .  y-i  =  o  for  i  i  1  (59) 

Note  that  the  expression  in  the  square  brackets  denotes  the  {t  ,s)-th  element  of 
_  1 R0 .  with  t  and  s  taking  values  from  0  to  N  as  indicated  in  the  sub¬ 
script.  Substituting  yt  :=  [0  •••  0  y(D)  •••  y(f)].  we  get 

-iRodv  =  [v(0v*(s)  -  y{t-T)y\s-T)]tJ=0:N 

=  [v(Ov#(s)]t.,=0jy  since  y(-i)  =  0  for  i  1  .  (60) 

Consider  the  following  nxn  submatrix  of  _|R0:jv.  denoted  briefly  as 
[— |RLv-r»:tf: 

[_|R]*-**  =  [v(0v'(s)3».*=,v-«jv 


h/(N  -n) 


y{N) 


[l][l/'(N--n)  •••  y*(//)] 


(61) 


h 


=  G N-n-.N  ’  £  ■  Gnix.n  .  say. 


We  see  that  the  displacement  rank,  which  equals  the  dimension  of  E,  is 

a  =  1  .  (62) 

Also  that  the  pinning  vector  is 

X  =  [0  •  •  •  0  1],  (63) 

because 

<[VN-n .  VnY.  X>  =  [y*(Ar-n) .  y‘{N)]‘ 

~  Gn-h  X 

With  we  can.  as  before,  build  a  lattice  filter  with  n  sections,  where  the 

(m  +  l)-th  section  has  reflection  coefficient 

fcm  +  1.1  =  P'Ayt.Vt-m- 1)  •  U  =  IVt-m .  Vt-ll  • 

To  update  this  quantity,  we  need  where 

% 

Vm.t  =  P'Ayt.X)  and  p^.t-1  -  pAM-m- i>X)  ■ 

In  the  present  problem 

<yt,X>  -  y(t)  and  <yt-m-i.*>  =  y(t-m- 1)  .  (64) 

Then  some  calculation  will  show  that 

Vm.t  =  em(t)c  ,  nmt  =  rm(f-l)  c  (65) 

where  the  proportionality  factor  c  arises  from  the  absence  of  a  scaling  factor 
(corresponding  to  a  so  called  second  normalization  in  Lee  et  al.  (1981)).  The 
point  really  is  that  the  lattice  sections  need  only  to  propagate  the  sample  values 
$ffm(f),  fm(f-l)j  and  not  the  row  vectors  We  refer  to  the  paper 

of  Lee  et  al.  (1981)  for  further  details  on  the  prewindowed  lattice  form. 
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4.  GENERALIZED  SCHUR  ALGORITHMS  AND  TIME  INVARIANT  LATTICE  FILTERS 

We  shall  begin  by  considering  tests  for  positivity  (positive-definiteness)  of  a 
given  matrix,  starting  with  Toeplitz  matrices,  which  can  arise  as  covariance 
matrices  of  stationary  processes.  A  simple  test  is  shown  using  Schur 
coefficients  (=  reflection  coefficients)  obtained  via  the  standard  Schur  algo¬ 
rithm.  Then  a  generalized  Schur  algorithm  will  be  developed  yielding  general¬ 
ized  Schur  coefficients  which  will  give  positivity  tests  for  non-Toeplitz  matrices. 
Then,  the  interpretation  of  the  Schur  algorithm  is  given,  using  concepts  of  spec¬ 
tral  functions  and  Schur  invariance.  Later,  we  shall  consider  these  results  in  the 
context  of  nonstationary  process,  and  discuss  prediction  filters  and  the  general¬ 
ized  Szego-Levinson  algorithm.  We  shall  briefly  also  mention  orthogonal  polyno¬ 
mial  and  Christoffel-Darboux  formulas. 

4.1  Positivity  of  Toeplitz  Matrices 

Consider  a  stationary  stochastic  process  \yt  $  with  an  associated  Toeplitz 
covariance  matrix 

R0  R\  Rz  •  ■  ■  Rn 

R\ 

Rz  ■  •  • 

~Rq:N  •  ...  ,  R  0-1 

•  ■  R\ 

Rn  R\  Rz, 

There  are  many  criteria  for  testing  whether  a  given  matrix  is  positive 
definite,  based  on  computing  eigenvalues  or  by  reduction  to  a  sum  of  squares, 
etc.,  but  we  are  seeking  more  special  tests  that  exploit  the  special  (Toeplitz) 
structure  of  the  matrix.  Such  tests  are  in  fact  available,  based  on  certain 
function- theoretical  characterizations  first  developed  in  the  early  part  of  this 
century. 

Let  us  temporarily  assume  that  we  have  an  infinite  sequence 
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. 


and  define  a  function  (assuming  R0  =  l  ) 

C{z)  =  1  +2  £  Kz*  .  (66) 

l 

Then  we  have  the  following  characterization 

Lemma  (Caratheodory  (1910)  ):  The  matrices  $R0Af,  N  =  0, !....{  will  be  nonne¬ 
gative  definite  if  and  only  if  C(z)  is  analytic  and  has  nonnegative  real  part  in 
the  unit  disc. 

Testing  if  Re  C(z)ss0  can  be  replaced  by  the  generally  simpler  problem  of 
testing  for  boundedness  by  making  the  transformation 

5(Z)  =  [l-C(z))[l  +  C(Z))-1  (67) 

Then  it  is  not  hard  to  see  that  C{z)  is  analytic  and  ReC(z)  Si  0  in  \z  |  ^  1  if 
and  only  if 

S(Z)  is  analytic  and  !S(z)]  *s  1  in  |Z  |  <  1  .  (68) 

Such  a  function  will  be  called  a  Schur  function. 


Lemma:  (Schur  (1917)):  Let  S0(Z)=S(Z)  and  define 


<^i+l(z) 


1  5t(z)-Sj(0) 

Z  1  -5t(0)5t(Z)  ’  1  ~Ul1 


(69) 


Then  S(z)  is  a  Schur  function  if  and  only  if  {St(z)J  are  also  Schur  functions. 
Furthermore,  let  us  define  what  we  shall  call  Schur  coefficients 


*i+.=Sx(0)  . 


(70) 


Then  S(z)  is  a  Schur  function  if  and  only  if  jfc,  j^si  for 


These  are  in  fact  the  reflection  coefficients  used  to  construct  the 

time-invariant  filter  for  a  stationary  process  (cf.  Section  2),  a  connection  first 
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noted  by  Dewilde,  Vieira  and  Kailath  (1976). 

Some  examination  will  show  that  the  computation  of  \kix  1  needs 

knowledge  only  of  [R0=-l,Ri . R^\  and  that  therefore  we  in  fact  have  a  test  for 

any  finite  matrix  R0 .s'- 

Rojv  ^  0  if  and  only  if  1^  |  ^  1  .  1  <  i  <  N  .  (71) 

Moreover,  further  reflection  will  show  the  reasonableness  of  the  following  result 

and  corollary. 

Theorem:  The  covariance  matrix  Ror/  can  be  extended  by  adding  terms 
\Ri,i>N\  corresponding  to  added  reflection  coefficients 

fci.  such  that  |A^|<l,i>V. 

Corollary:  The  ’maximum  entropy”  extension  is  achieved  by  the  choice 

=0,  i>N\ 


Before  leaving  this  section,  we  remark  that  the  reflection  coefficients 
{ki,  l*s i<N]  can  be  computed  by  operations  on  the  first  N  coefficients  of  the 
functions  {1  +  C(z),l-C(z)|.  in  particular  on  the  matrix 


1 

0 

Rx 

-Rx 

Rz 

-Rz 

Rn 

-Rn 

The  details  of  the  algorithm  will  be  shown  later  (Section  4.4). 


4.2  Positivity  of  Non-Toeplitz  Matrices 

Suppose  now  that  the  process  is  not  stationary  so  that, 

Then  the  associated  covariance  matrix  will  not  be  Toeplitz  and  we  might  ask  how 
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we  could  extend  the  previous  tests  for  the  positive  definiteness  of  Rc,v  We 
could  try  to  introduce  Caratheodory  and  Schur  functions  of  2-variab!es.  For 
example, 


C[z  ,cj)  =  R0  0 
S  (z  ,cj)  = 


+  2  £  E 

1=1  j=l 

*C,C  -  c{z  ,'j) 
fta.o  +  G(z  ,cj) 


(72a) 

(72b) 


But,  nothing  satisfactory  seems  to  have  been  found  in  this  way,  at  least  so  far 

However,  we  shall  show  that  the  concept  of  displacement  ranks  and 
differential  generators,  as  introduced  in  Section  3.2,  does  allow  us  to  make  pro¬ 
gress. 


4.3  Differential  Generators  of  R  : 


We  recall  from  Section  3.2  that  we  can  then  write,  for  nonunique  Gc  jV, 

—I  Rc.V  =  Ro;Ar  —  2  R0JfZ  —  Gq  n  E  Go.jW  (73) 


where  G0Jj  is  an  (N  +  l)xa  matrix  with  rows  [g^.g^ . gs j,  say,  and  I  is  a  sig¬ 

nature  matrix  for  _|Rc..,y. 


E  = 


0 


G0N  is  nonunique,  because  any  product  G^NU  with  a  E-unitary  matrix  U 
will  also  satisfy  (73).  One  consequence  is  that  we  can  always  choose,  for  conveni¬ 
ence, 

9o  =  Wo2  0  •••  0],  (74) 

This  choice  is  a  natural  one  because  the  reader  can  check,  it  allows  us  to  make 
the  first  column  of  G0N  identical  to  the  first  column  of  R0:Ar.  R  for  some  reason 
we  have  a  generator  without  this  property,  an  appropriate  E-unitary  matrix  U 
can  also  be  chosen  to  ensure  it~wc  shall  give  the  precise  construction  a  little 
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later  (see  Sec.  4.4,  Step  lb). 

Having  £0jv,  we  can  now  associate  a  complex  function  of  a  single  variable 
with  a  non-Toeplitz  matrix  R, 

G(z)  =  £  gx2*  +  0(zN")  (75) 

<=c 

Note  that  this  function  is  unlike  the  extended  Caratheodory/Schur  functions, 
which  are  functions  of  two  variables  To  gain  some  insight  into  this  function 
G(z),  let  us  examine  it  for  the  stationary  case. 

G(z  )  in  The  Toeplitz  Case 

When  Rc  jv  is  Toeplitz,  we  see  that 


*0 

Rx  ■ 

■  ■  Rn 

0 

0  .  . 

.  0  1 

Rx 

Rc  ■ 

0 

Re  R  i 

■ 

1 

• 

:  i  Rx 

• 

■.  Rx 

Rn 

■  Rx  Rc 

0 

Rn- i  •  • 

.  R0 

7?o  R  i 

Ri 

Rn 


■  Rn 

0 


Therefore  a=2,  q+=q~=l  and 


2  = 


1 

0 


0 

-1 


We  can  also  check  that 


(76) 
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G°  N  ~  ITiTz 


Rio 


the  Caratheodory  function, 

C(z)  =  1  +  2  £  Ri z‘  +  C(zv+1) 


Xq 

0 

Rt 

-R  i 

R  2 

-Rz 

Rn 

-rn 

0=1- 

Then 

A'+l) 

(77) 


i=l 


we  see  that 


G{z) 


1  +  C(z)  1  -  C(z) 
2  2 


Note  that  the  entries  of  G{z)  are  precisely  those  defining  the  Schur  function 
S(z)  in  the  Toeplitz  case,  which  was  introduced  in  a  rather  mysterious  way 
then.  Here  a  clue  begins  to  appear  to  establish  a  positivity-test  for  the  non- 
Toeplitz  covariance  matrix.  For  reasons  of  space  we  shall  forego  examination  of 
how  the  Schur  algorithm  in  the  stationary  case  can  be  recast  in  terms  of  G{z), 
instead  we  shall  directly  present  the  (very  easy  to  describe)  general  Schur  algo¬ 
rithm: 


4.4  General  Schur  Algorithm 

In  the  non-Toeplitz  case,  G(z)  will  not  have  any  such  striking  form  as  in 
(53),  but  it  will  nevertheless  turnout  to  yield  generalized  Schur  coefficients  in  a 
nice  way.  Also  the  procedure  includes  the  special  method  for  the  Toeplitz  case. 
Note  first  that  in  the  general  case,  Gcjv  will  have  a-columns,  which  we  partition 
as  follows: 


Gq.n  - 


a0  | 

7i 

•  I 

:  i 

7N 


0  0 
r« 


r V 


(78a) 
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Here  we  have  exploited  the  nonuniqueness  of  G0..v  to  make  the  first  row 
[a0  0  •  ■  •  0],  where 

*c.o  =  =  p0  £  >0  (?8b) 


In  a  corresponding  way,  we  partition  £  as  follows: 


£  = 


1 

0 


0 

-/ 


where 


-J  - 


(79a) 


(79b) 


Thus  £  and  J  are  axa  and  (a— l)x(a-l)  matrices.  In  order  to  present  the 
Schur  algorithm,  we  need  a  further  definition. 


£-Unitary  Matrices:  Matrices  of  the  following  type  play  an  important  role  in  the 
ensuing  theory: 


Q(K,J)  = 


(1  -KJK‘YUZ  0 

1  -KJ 

0  (J  -  K'K)-'/z 

-1C 

(BO) 


We  can  verify  that 


0£  0*  =  £ 


(81) 


thus  being  called  £-unitary  matrices.  In  fact,  it  can  be  shown  that  an  arbitrary 
£-unitary  matrix  will  have  the  form 


U  x 
0 


0 

Uz 


Q(K,J)  where  f/j  is  unitary  and  t/2  is  /-unitary  . 


(82) 


The  Generalized  Schur  Algorithm: 

Start  with  Gojv  in  the  'proper'  form,  i.e.,  such  that  the  first  row  g0  of 
Gq  //  is  of  the  form 


*30- 


the  rows  r\. 


G(»  = 


■  0], 

c 

with 

one 

oc 

‘  1 

7i 

7s- 1 

-  .V 

(83) 


On 


G(z)  =  V  5tz‘  . 

i=0 


this  operation  amounts  to  forming 

G<»(z)  =  G(z) 


/  0 
0 


-l 


(84) 


Step  I b.  Renormalization:  Find  so  that  G  vl^0  is  again  proper, 

i.e., 


7  ll) 

o 

. 

o 

7&i 

rV>, 

GW  =  GWQ-'iK^J)  = 


we  shall  show  that  this  can  be  achieved  by  chocsing 

Ki  —  — £7 Q  \J  ■ 

Proof  of  (86):  From  (85)  we  need  0( KltJ)  to  be  such  that 

the  1st  row  of  C  ©1  =  [a,  0  •  •  O]0(/Ti./) 

=  a,  1st  row  of  Q{Ki.J) 

=  a^l  -  KiJK[)~'/z[\  -K,J)  . 

Comparing  with  (83),  we  get 


(85) 


(86) 
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S’’ 


a0  =  cri(l  -  KiJK'x)~wz  and  Tt  =  -  o0AV  .  Hence  .... 

Steps  II.  Ill,  etc:  The  two  steps  as  before,  namely,  shifting  and  renormalization, 
are  repeated  until  G^m'>  is  a  matrix  with  all  columns  zero  except,  perhaps,  for 
the  first  column.  At  this  stage,  no  more  renormalization  is  required  and  all 
future  j/f,J  will  be  zero.  Note  that  m<N,  the  size  of  the  original  matrix  and  if 
m=N.  G^  will  consist  of  only  a  single  row. 

In  this  way  ,  we  can  associate  a  set  of  lx(a-l)  vector  coefficients 
\Ki.Kz . Kn]  with  G0 :N  and  hence  with  Rc  ^ .  The  following  result  holds. 

Theorem:  The  matrix  Rojv  is  positive-definite  if  and  only  if  the  set  of  Schur 
coefficients  associated  with  it  satisfy 

1  -  KxJK->  0  for  i  =  1.....JV  .  (87) 

In  the  Toeplitz  case  a= 2,  J=\  so  that  have  to  be  scalars  of  magni¬ 

tude  less  than  one.  as  we  stated  earlier  We  have  a  striking  generalization  here. 

We  shall  not  give  a  formal  proof  of  this  theorem  here,  but  we  shall  outline 
the  underlying  ideas  in  the  next  section. 

4.5  Interpretation  of  Schur  Procedure 

The  theorem  rests  mainly  on  the  concept  of  Schur  complements  and  a 
lemma  due  to  Schur.  Let  us  write  R„  N  as. 


R,  ,0 

0  .  .  .  0 

Ro  .0  [Ro.o 

•  Roj/]  + 

Rq  n  - 

■ 

A 

RoJi 

0 

Then  A  is  called  the  Schur  complement  of  R0t0  in  R.  Then  the  following  result 
is  easy  to  see. 
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Lemma  R  is  positive  definite  if  and  only  if  R00  >0  and  the  Schur  comple¬ 
ment  of  Ra0  in  R  is  positive  definite. 

The  following  reformulation  of  this  fact  will  be  useful.  Let  us  define 

S(z,o)  =  £  RijZ^’i  (89) 

<j=o 

If  R  =  [tfij]  is  positive  definite,  this  function  will  be  said  to  be  the  two- 
dimensional  spectral  function  of  the  process.  (Caution:  The  spectral  function 
S(z,w)  must  not  be  confused  with  the  two  dimensional  Schur  function  S(z.w) 
suggested  in  (72)).  Then  we  can  restate  the  lemma  as  follows: 

Lemma-  The  function  S (z.o)  associated  with  R  is  a  spectral  function  if  and 
only  if  S(0,0)  =  R0  0  >  0  and  S^z.cj)  is  also  a  spectral  function,  where 

s,(*.u>  =  S(«.“)...-..S(«.P)S--(0,0)S(0.B)  m 

Z  U 

The  reader  should  check,  by  comparing  coefficients  of  ziu'i  that  S^z.w) 
indeed  represents  the  Schur  complement  A  in  the  transform  domain. 

The  above  lemma  may  be  extended  to  the  "ARMA  case"  readily,  which  we  do 
here  in  order  to  bring  out  connections  to  the  paper  of  Dewilde  in  this  volume. 


Lemma  (Schur-ARMA):  Let  "a"  be  point  in  the  domain  of  convergence  of 
S(z.cj).  Then  S(z,w)  is  a  spectral  function  if  and  only  if  S(a,a)  >  0  and 
Sj(z  ,u )  is  also  a  spectral  function,  where 


Si(z.w) 


S(z ,u)  -  Sfz.tQS-Ha.cQSta.cj) 


z  -  n  _ 
1  -  a'zj(l 


u  -  a 
-  a'u 


(91) 
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These  lemmas  are  of  little  practical  use  as  it  is.  because  computation  of 
Sj (z  ,o)  from  S (z  ,d)  is  not  efficient.  So,  we  now  try  to  impose  a  certain  struc¬ 
ture  on  S(z.cj),  which  will  hopefully  enable  us  to  perform  recursions  on  some 
simpler  function  than  S(z  ,cj). 


a.  Schur  Invariance: 

First,  let  us  look  for  a  S(z,o)  that  remains  unaltered  under  Schur  comple¬ 
mentation.  i.e.,  such  that 

Si(z ,«)  =  S(z  ,cS)  (92) 

Substituting  this  into  (9l)  gives,  after  some  algebra,  the  fact  that 

(l-2u*)S(2,y)  =  Ga(z)C0*(y)  (93a) 

where 


Ca(z)  = 


-  a*z)S(z  ,n)S~1/2(n,g) 

(i-M  2)1/z 


(93b) 


Let  us  pause  a  little  here  to  examine  the  AR-case.  Then  a  =  0.  and  (93) 
becomes 


(1  -  zu*)S(z  ,to)  =  G(z)G*(y) 


(94) 


where 


G(z)  =  S(z.O)S’1/z(0,0) 

In  the  "time-domain",  (94)  translates  to  the  relation 

R  -  ZRZ*  =  GG*  (95) 

Processes  satisfying  (94),  i.e.,  such  that 

S(z.w)  =  G(z)  1  ■  ,  G*(u)  (96) 

l  —  z  u 

have  a  simple  interpretation.  Consider  "one-sided”  white  noise  |ut.  fc  &  Oj 
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such  that  Ehikuf  =  I6U.  Its  spectral  function  is 

Su(z.<y)  =  £  <5V  =  - — - — —  (97) 

ij  =o  *  Z  CJ 

Now  the  process  in  (96)  can  be  interpreted  as  the  nonstationary  process 
obtained  by  putting  "one-sided"  white  noise  through  a  time-invariant  filter  with 
transfer  function  C{z).  These  are  perhaps  the  simplest  kind  of  nonstationary 
processes,  with  displacement  rank  a  =  1. 

Let  us  now  return  to  our  main  problem,  of  imposing  certain  useful  struc¬ 
ture  on  the  processes  to  be  considered.  Our  earlier  condition  that  S(z,cj) 
remain  unaltered  under  Schur  complementation,  which  resulted  in  S(z,y)  tak¬ 
ing  the  form  in  (96).  can  be  generalized  as  follows: 

Let 


S(z,cj)  = 


Ga{z)ZG;[o) 


1  -Z-S 

where  Ga(z)  will  be  required  to  be  "proper"  in  the  sense  that 

Ga(a)  =  1  0  •  •  •  0] 

and 


(98) 


(99a) 


cr|  =  S(a,a)(i  -  |  a  !2) 

The  Schur  complement  is  given  as  below  for  the  ARMA-case: 


(99b) 


S(z  ,u) 

i  -  S(z,a)S  *(a ,a)S(a,cj) 

z  -  a 

y  -  a 

- 

,1  “  a  Z  . 

1  —  a*y 

Substituting  (98)  into  the  expression  (91)  shows  that  S,(z,«)  must  have  the 

form 


S  i(z,w) 


1  —  zv* 


(100a) 
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where 


Gi(z) 


C.(*) 


i 

0 

n 

z  -  a 

i 

u 

1  -  a  z 

Q-\KUJ) 


where  Kx  is  such  that 

G\(az)  =  ^o2[l  0  '  •  •  Oj 


(100b) 


(100c) 


Thus  by  considering  processes  whose  spectral  function  can  be  written  as  in 
(100a)  we  are  able  to  reduce  recursions  of  S(z,cr)  to  recursions  on  G0(z).  The 
G„ -recursions  are  related  through  0-matrices.  So,  we  can  continue  this  above 
procedure  by  extracting  0-sections  at  a2,  a3,  •  •  •  .  Note  that  after  one  step,  we 
have 


G(z) 


G,(z) 


i 

0 

0 

z  —  a 

1  -  a*z 

/«- 1 


0  (*,./) 


(101a) 


=  Gj(z)  •  0t(z)  . 


(101b) 


The  process  will  be  finite  order  if 

G(z)  0f*(z)  ••  0 ul{z)  =  Cu{z)[  1  0  •  •  •  0)  for  M  <  «.  (102) 

We  shall  make  this  assumption  for  convenience,  although  it  is  not  essential. 
Thus, 

G(z)  =  GA/(z)[l  0  ■  •  •  O]0*(z)  •  •  •  0,(z)  (103) 

Note  that  the  product  of  0  matrices  is  again  a  ^-lossless  matrix. 

b.  Congruence: 
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Here  we  will  show  that  the  Schur  coefficiens  AT*  (and  hence  0  -  sections) 
do  not  uniquely  determine  a  process.  In  fact,  let  h(z)  be  analytic  and  inverti¬ 
ble  in  some  region  containing  the  points  dj,  a2,  ...  au.  Then  G(z)  and 
h{z)G{z)  have  the  same  Schur  coefficients.  For, 

h(z)G(z)  =  [ h(z)y(z )  h(z)r(z)] 

and  since  we  work  with  ratios,  it  is  suggestive  that  the  Schur  coefficients  will  be 
the  same.  So,  G{z)  and  h{z)G(z)  are  said  to  be  congruent.  The  following 
lemma  shows  the  relation  between  covariance  matrices  of  congruent  processes. 


Lemma:  If  G{z)  is  replaced  by  h(z)G{z).  i.e.,  G(z )  -»  /i(z)C(z),  then  the 
covariance  matrix  is  transformed  as: 


R„jv  -  L{h)RoNL\h)  . 


0 


hi  h0 


(104) 


Proof:  In  the  time  domain, 

G(z)  -♦  h(z)G(z )  ,  implies  that  G  -*  L{h)G  . 

Now  from 


R  -  ZRZ '  =  GUG* 


we  can  write 

LRL'  -  LZRZ'L *  =  LGZG'L' 

Since  ZL  =  LZ,  we  have 

LRL*  -  Z{LRL')Z *  =  LCEC'L*  . 

which  shows  that,  under  congruence  of  the  generators,  we  have  congruence  of 
the  covariance  matrices,  i.e., 
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R  -♦  LRL '  . 


Since  the  /^-coefficients  define  a  covariance  only  up  to  congruence,  we  shall  now 
consider  a  canonical  representative. 

c.  Canonical  Representative: 

It  can  be  shown  that  every  generator  G(z)  can  be  transformed  by  a 
congruence  into  a  canonical  form. 

G(z)  :=  [1  0  •  •  •  0]3.v(z)  ■  •  0,(z)  (105) 

Note  that  G(z)  is  completely  determined  by  the  Schur  coefficients. 

So  far  we  have  spoken  only  of  tests  for  positivity  and  of  the  generalized 
Schur  coefficients.  Through  the  canonical  generators  we  shall  be  able  to  relate 
these  to  the  prediction  or  whitening  filters  that  define  the  residuals  jem  ,  J 

4.6  Innovations  Filters  and  Generalized  Orthogonal  Polynomials 

As  we  just  stated  there  is  a  transformation  relating  G(z)  to  G{ z),  which 
we  can  write  as, 

a  (z)G(z)  =  G(z)  (106a) 

Let  Go:m  denote  the  first  m  rows  of  G0  y.  Then  we  can  find  a  polynomial 
Om{z)  of  degree  m  such  that 

Om(z)  •  G{z)  =  Gm{z)  +  0(zmM) 

We  shall  write  this  as 

°m(z)G(z)?  6^(z)  (106b) 

where  2  denotes  equality  of  the  first  m  terms.  Then  after  some  calculation, 
for  which  we  refer  to  the  thesis  of  Lev-Ari  (1982),  we  can  claim  the  following. 
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THEOREM  The  function  0^(2)  defines  the  innovations  filter  at  time  t  =m 


This  theorem  does  not  indicate  how  to  actually  compute  the  ia771(2)|.  To  do 
this  conveniently,  we  introduce  the  concept  of  admissibility. 

ADMISSIBILITY:  A  generator  G{z)  is  called  \-admissible  if 

G(z  )A  =  1  (107) 

where  A  is  a  vector  of  dimension  a,  i.e., 


9o 

X, 

1 

0 

9n 

K 

6 

[Lerer  and  Tismenetsky  (1931)  introduced  similar  constraints  to  study  general¬ 
ized  Bezoutians.]  We  know  that 

am(z)C(z)  a  <5m(z) 

Multiplying  both  sides  by  A.  and  using  (107)  gives  us  ^(zjas 
am(z )  —  Gm[z)\  . 

=  [10  •  •  •  0]Om(z)  •  •  •  0,(z)A  (108) 

We  can  represent  this  in  block  diagram  form  as  shown  in  Figure  4.  Each  section 
in  this  cascade  structure  has  the  generalized  lattice  form  shown  in  Figure  3. 
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Figure  4.  Generalized  Lattice  Filter  with  Time-Invariant  Sections 


Orthogonal  Polynomials 

The  theorem  that  the  0^(2)  define  the  prediction-error  filters  means  that 
they  have  certain  orthogonality  properties.  In  fact,  they  turn  out  to  be  natural 
extensions  of  the  Szego  polynomials  orthogonal  on  the  unit  circle  (see,  e  g., 
Geronimus  (1961),  Kailath.  Vieira  and  Vorf  (1978)).  If  we  denote  the  other  out¬ 
puts  from  the  cascade  0m(z) . ©i(z)  by  Qniz),  an  (a-'.)-dimensional  vector, 

then  we  clearly  have  the  recursive  updating  formula 


(109) 


which  generalizes  the  well  known  Szego  recursion  formula.  The  formula  (109)  is 
(a  slight  generalization  of)  the  generalized  Levinson  recursions  obtained  by 
FYiedlander,  Kailath.  Morf,  Ljung  (1978),  (1979).  The  present  derivation  is  much 
more  transparent,  and  in  fact  it  was  in  large  part  the  search  for  such  a  simple 
derivation  that  led  to  the  studies  presented  here. 


“m(*) 

“m- l(z) 

=  ©m(z) 

Cm{z) 

cm- i(z) 
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Christoffel-Darboux  Formulas 


With  orthogonal  polynomials,  we  always  have  the  so-called  Christoffel- 
Darboux  formulas,  which  give  expressions  for  R_1.  In  our  case,  we  have  the  fol-' 
lowing  generalized  forms. 

Let 

(*•«):=[!  2  •••  «  •••  umy  (HO) 

and  also  define  the  "reciprocal  polynomial"  by 

“m(2)  =  zm[am(l/z  ')]'  (111) 

Then  we  can  establish  the  generalized  Christoffel-Darboux  formulas, 

Km{z.u)  =  £  ai(z)a{(cS)  (112) 

i=0 

=  ttmtl(z)ttmn(2J)  —  Cm  +  l(2)‘^^'m  +  l(w)  +  ^  (i!3) 

The  only  surprising  thing  about  this  generalized  formula  is  the  presence  of  the 
term  X'EX.  Some  light  will  be  shed  on  this  in  the  next  subsection. 

4.7  More  on  Generators  and  Admissibility 

Now  we  address  ourselves  to  the  question,  can  we  always  find  admissible 
generators?  The  answer  is  essentially  ’yes’,  which  we  shall  elaborate  now.  First, 
we  note  the  concept  of  equivalence  and  certain  results  due  to  Livsic  (1979)  and 
Potapov  (1960).  Equivalent  generators  and  \G2,Zz\  are  such  that 

R-ZRZ  =  GiEjCj  =  G2&2G2 

The  following  results  are  an  adaptation  of  results  proved  by  Livsic  and  Potapov. 

1.  All  minimal  generators  are  related  by  E-unitary  matrices. 
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2.  In  general,  all  generators  can  be  obtained  from  any  minimal  generator  by 
some  combination  of  certain  elementary  operations  (called  trivial  lengthen¬ 
ings,  neutral  lengthenings,  and  E-rotations). 

Now  let  us  consider  the  problem  of  finding  admissible  generators.  It  turns 
out  that  we  have  three  distinct  cases,  which  in  fact  give  a  classification  of  covari¬ 
ances  into  three  distinct  types:  Let  us  define  the  index  of  non-Toeplitzness  of  R 
as  first  introduced  in  Friedlander  et  al  (1978)  to  be  (see  (1 14)  below) 

6  =  rank  [Rt,s  - 

It  is  easy  to  see  that  (a-6)  can  be  2.  1  or  0,  where  of  course  a  is  the  dis¬ 
placement  rank. 

Theorem:  Let  R-Z  R  Z'=GT.G*  where  G  is  a  minimal  generator.  Let 

G(z)  =  £  £7iz< 

t 

Then  for  every  minimal  generator,  if 

(i)  a-<5=2,  there  exists  a  X  such  that  G(z)A=l  and  A'£A=0.  An  example  is 
the  Toeplitz  case,  where  A=[l  -1], 

(ii)  o-5=l,  there  exists  a  X  such  that  G(z  )A=1,  but  X'EX^O 

(iii) a— 5=0,  there  exists  no  X  such  that  C(z)A=l.  However,  by  increasing  the 
dimension  to  (5+2),  we  can  always  find  a  G  that  is  admissible,  with  any 
desirable  X.  An  example  is  the  pre-windowed  least-squares  problem  of  Sec. 
3.4. 

Remark:  The  class  of  covariances  satisfying  a  -  <5  =  2  deserves  the  name 
qiuisistationary  or  close  to  stationary.  The  simplest  members  in  this  class 
are, indeed,  Toeplitz  covariances,  and  for  any  other  member  the  distance  from 
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stationarity  is  indicated  by  the  index  of  non-Toeplitzness  6  =  a  -  Z.  Notice  also 
that  the  surprising  term  X'EX  which  appeared  in  (113)  vanishes  for  covariances 
in  this  class. 

Construction  of  an  Admissible  Generator 

We  shall  give  a  method  of  constructing  an  admissible  generator,  though  it 
may  not  necessarily  be  minimal.  Let 

*0.0  *0.1  ■  •  ■ 

*  1,0 


R-ZRZ‘  = 

*v.o 

we  write  A  as, 

A  —  £,.vIc*r.v 

where  E0  has  dimension  <5x<5.  Now  define, 


*0.0 

|  0  •  •  •  0 

1  £ 

il 

* 

o 

o 

*1.0 

.  1 
* 

1  D\:N 

~ *1.0 

1  • 

j 

*W.  0 

— *,v.c 

E 

1  0 
=  0  E, 

where  Rio  =  R0}0/2Ria 


1°  0  "‘I 

Then,  it  is  easily  checked  that 

=  R  -  ZRZ' 


1  1 

4 

0 

6 

6 

-l 

0 
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and  that 


thus  making  G0  _v  an  admissible  generator,  with  X*  =  [1  0  0  — 1]. 

Remark:  Note  that  this  construction  matches  the  statement  in  the  theorem 
when  a  -  (5  =  Z  or  a  -  5  =  0.  It  is  however,  nonminimal  if  a  —  6  =  1. 

4.6  Concluding  Remarks 

We  have  given  an  outline  of  the  general  theory  of  lattice  filters  for  nonsta¬ 
tionary  processes,  emphasizing  how  the  displacement  structure  of  the  covari¬ 
ance  can  be  used  to  obtain  different  kinds  of  implementations.  We  find  it  fas¬ 
cinating  that  the  displacement  rank  concept  provides  such  technologically 
significant  generalizations  of  the  several  classical  results  of  Schur  and  Szego, 
obtained  without  any  idea  of  practical  applications. 
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